Search for: All records

Creators/Authors contains: "Gao, Shan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Abstract DNA modifications, such as N6-methyladenine (6mA), play important roles in various processes in eukaryotes. Single-molecule, real-time (SMRT) sequencing enables the direct detection of DNA modifications without requiring special sample preparation. However, most SMRT-based studies of 6mA rely on ensemble-level consensus by combining multiple reads covering the same genomic position, which misses the single-molecule heterogeneity. While recent methods have aimed at single-molecule level detection of 6mA, limitations in sequencing platforms, resolution, accuracy, and usability restrict their application in comprehensive epigenetic studies. Here, we present SMAC (single-molecule 6mA analysis of CCS reads), a novel framework for accurately detecting 6mA at the single-molecule level using SMRT circular consensus sequencing (CCS) data from the Sequel II system. It is an automated method that streamlines the entire workflow by packaging both existing software and built-in scripts, with user-defined parameters to allow easy adaptation for various studies. By utilizing the statistical distribution characteristics of enzyme kinetic indicators on single DNA molecules rather than a fixed cutoff, SMAC significantly improves 6mA detection accuracy at the single-nucleotide and single-molecule levels. It simplifies analysis by providing comprehensive information, including quality control, statistical analysis, and site visualization, directly from raw sequencing data. SMAC is a powerful new tool that enables de novo detection of 6mA and empowers investigation of its functions in modulating physiological processes.
    Free, publicly-accessible full text available March 1, 2026
  2. Abstract N6-adenine methylation occurs in both DNA and RNA (referred to as 6mA and m6A, respectively). As an extensively characterized epi-transcriptomic mark found in virtually all eukaryotes, m6A in mRNA is deposited by the METTL3-METTL14 complex. As a transcription-associated epigenetic mark abundantly present in many unicellular eukaryotes, 6mA is coordinately maintained by two AMT1 complexes, distinguished by their mutually exclusive subunits, AMT6 and AMT7. These are all members of the MT-A70 family of methyltransferases (MTases). Despite their functional importance, no structure for holo-complexes with cognate DNA/RNA substrate has been resolved. Here, we employ AlphaFold3 (AF3) and molecular dynamics (MD) simulations for structural modeling of Tetrahymena AMT1 complexes, with emphasis on ternary holo-complexes with double-stranded DNA (dsDNA) substrate and cofactor. Key structural features observed in these models are validated by mutagenesis and various other biophysical and biochemical approaches. Our analysis reveals the structural basis for DNA substrate recognition, base flipping, and catalysis in the prototypical eukaryotic DNA 6mA-MTase. It also allows us to delineate the reaction pathway for processive DNA methylation involving translocation of the closed form AMT1 complex along dsDNA. As the active site is highly conserved across the MT-A70 family of eukaryotic 6mA/m6A-MTases, the structural insight will facilitate rational design of small molecule inhibitors, especially for METTL3-METTL14, a promising target in cancer therapeutics.
    Free, publicly-accessible full text available July 8, 2026
  3. Abstract Although an established model organism, Tetrahymena thermophila remains comparatively inaccessible to high throughput screens, and alternative bioinformatic approaches still rely on unconnected datasets and outdated algorithms. Here, we report a new approach to consolidating RNA-seq and microarray data based on a systematic exploration of parameters and computational controls, enabling us to infer functional gene associations from their co-expression patterns. To illustrate the power of this approach, we took advantage of new data regarding a previously studied pathway, the biogenesis of a secretory organelle called the mucocyst. Our untargeted clustering approach recovered over 80% of the genes that were previously verified to play a role in mucocyst biogenesis. Furthermore, we tested four new genes that we predicted to be mucocyst-associated based on their co-expression and found that knocking out each of them results in mucocyst secretion defects. We also found that our approach succeeds in clustering genes associated with several other cellular pathways that we evaluated based on prior literature. We present the Tetrahymena Gene Network Explorer (TGNE) as an interactive tool for genetic hypothesis generation and functional annotation in this organism and as a framework for building similar tools for other systems. 
  4. Stable inheritance of DNA N6-methyladenine (6mA) is crucial for its biological functions in eukaryotes. Here, we identify two distinct methyltransferase (MTase) complexes, both sharing the catalytic subunit AMT1, but featuring AMT6 and AMT7 as their unique components, respectively. While the two complexes are jointly responsible for 6mA maintenance methylation, they exhibit distinct enzymology, DNA/chromatin affinity, genomic distribution, and knockout phenotypes. AMT7 complex, featuring high MTase activity and processivity, is connected to transcription-associated epigenetic marks, including H2A.Z and H3K4me3, and is required for the bulk of maintenance methylation. In contrast, AMT6 complex, with reduced activity and processivity, is recruited by PCNA to initiate maintenance methylation immediately after DNA replication. These two complexes coordinate in maintenance methylation. By integrating signals from both replication and transcription, this mechanism ensures the faithful and efficient transmission of 6mA as an epigenetic mark in eukaryotes. 
    Free, publicly-accessible full text available January 21, 2026
  5. Although DNA N6-adenine methylation (6mA) is best known in prokaryotes, its presence in eukaryotes has recently generated great interest. Biochemical and genetic evidence supports that AMT1, an MT-A70 family methyltransferase (MTase), is crucial for 6mA deposition in unicellular eukaryotes. Nonetheless, the 6mA transmission mechanism remains to be elucidated. Taking advantage of single-molecule real-time circular consensus sequencing (SMRT CCS), here we provide definitive evidence for semiconservative transmission of 6mA in Tetrahymena thermophila. In wild-type (WT) cells, 6mA occurs at the self-complementary ApT dinucleotide, mostly in full methylation (full-6mApT); after DNA replication, hemi-methylation (hemi-6mApT) is transiently present on the parental strand, opposite to the daughter strand readily labeled by 5-bromo-2′-deoxyuridine (BrdU). In ΔAMT1 cells, 6mA predominantly occurs as hemi-6mApT. Hemi-to-full conversion in WT cells is fast, robust, and processive, whereas de novo methylation in ΔAMT1 cells is slow and sporadic. In Tetrahymena, regularly spaced 6mA clusters coincide with the linker DNA of nucleosomes arrayed in the gene body. Importantly, in vitro methylation of human chromatin by the reconstituted AMT1 complex recapitulates preferential targeting of hemi-6mApT sites in linker DNA, supporting AMT1's intrinsic and autonomous role in maintenance methylation. We conclude that 6mA is transmitted by a semiconservative mechanism: full-6mApT is split by DNA replication into hemi-6mApT, which is restored to full-6mApT by AMT1-dependent maintenance methylation. Our study dissects AMT1-dependent maintenance methylation and AMT1-independent de novo methylation, reveals a 6mA transmission pathway with a striking similarity to 5-methylcytosine (5mC) transmission at the CpG dinucleotide, and establishes 6mA as a bona fide eukaryotic epigenetic mark.
  6. Oceanic eddies accompanied by a significant vertical velocity (w) are known to be of great importance for the vertical transport of various climatically, biologically or biogeochemically relevant properties. Using quasi-geostrophic w-thinking to extend the classic “β-spiral” w-theory for gyre circulations to isolated and nearly symmetric oceanic mesoscale eddies, we propose that their w motion will be dominated by a strong east-west dipole pattern with deep ocean penetrations. Contrasting numerical simulations of idealized isolated eddies together with w-equation diagnostics confirm that the w-dipole is indeed dominated by the “eddy β-spiral” mechanism in the β-plane simulation, whereas this w-dipole expectedly disappears in the f-plane simulation. Analyses of relatively isolated warm and cold eddy examples show good agreement with the proposed mechanism. Our studies further clarify eddy vertical motions, have implications for ocean mixing and vertical transport, and inspire further studies.
  7. Katz, Laura A.; Capone, Douglas G. (Ed.)
    ABSTRACT Achieving protein diversity through genome and transcriptome processing is essential for organismal complexity and adaptation. The present work identifies that the macronuclear genome of Halteria grandinella, a cosmopolitan unicellular eukaryote, is composed almost entirely of gene-sized nanochromosomes with extremely short nongenic regions. This challenges our usual understanding of chromosomal structure and suggests the possibility of novel mechanisms in transcriptional regulation. Comprehensive analysis of multiple data sets reveals that Halteria transcription dynamics are influenced by: (i) nonuniform nanochromosome copy numbers correlated with gene-expression level; (ii) dynamic alterations at both the DNA and RNA levels, including alternative internal eliminated sequence (IES) deletions during macronucleus formation and large-scale alternative splicing in transcript maturation; and (iii) extremely short 5′ and 3′ untranslated regions (UTRs) and universal TATA box-like motifs in the compact 5′ subtelomeric regions of most chromosomes. This study broadens the view of ciliate biology and the evolution of unicellular eukaryotes, and identifies the Halteria genome as one of the most compact known eukaryotic genomes, indicating that complex cell structure does not require complex gene architecture.
  8. Abstract Polycomb group (PcG) proteins are widely utilized for transcriptional repression in eukaryotes. Here, we characterize, in the protist Tetrahymena thermophila, the EZL1 (E(z)-like 1) complex, with components conserved in metazoan Polycomb Repressive Complexes 1 and 2 (PRC1 and PRC2). The EZL1 complex is required for histone H3 K27 and K9 methylation, heterochromatin formation, transposable element control, and programmed genome rearrangement. The EZL1 complex interacts with EMA1, a helicase required for RNA interference (RNAi). This interaction is implicated in co-transcriptional recruitment of the EZL1 complex. Binding of H3K27 and H3K9 methylation by PDD1—another PcG protein interacting with the EZL1 complex—reinforces its chromatin association. The EZL1 complex is an integral part of Polycomb bodies, which exhibit dynamic distribution in Tetrahymena development: Their dispersion is driven by chromatin association, while their coalescence by PDD1, likely via phase separation. Our results provide a molecular mechanism connecting RNAi and Polycomb repression, which coordinately regulate nuclear bodies and reorganize the genome. 